Abstract

We prove that in the Hopfield model where the ratio of patterns to sites grows large, the free 
energy behaves like the free energy in the Sherrington-Kirkpatrick model. 



1 Introduction 



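The Hopfield model is the system of configurations $x \in \{\pm 1\}^N$ governed by the Gibbs measure

$$G(x) = \frac{1}{Z(\beta,B)}\, e^{-\beta H(x) + B\sum_i x_i}, \tag{1}$$

where $Z(\beta,B) = \sum_x e^{-\beta H(x) + B\sum_i x_i}$ is a normalising constant to make $G$ a probability measure,
$\beta$ and $B$ are constants representing the inverse temperature and external field respectively,
and $H$ is the Hamiltonian

$$H(x;\xi) = -\frac{1}{\sqrt{NM}}\sum_{k=1}^{M}(x\cdot\xi^k)^2 = -\frac{1}{\sqrt N}\sum_{i=1}^{N}\sum_{j=1}^{N}\Big(\frac{1}{\sqrt M}\sum_{k=1}^{M}\xi_i^k\xi_j^k\Big)x_ix_j. \tag{2}$$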

The special configurations $\xi^k \in \mathbb{R}^N$, $1 \le k \le M$, are called patterns for the reason that the
Gibbs measure tends to be concentrated near them, and thus these patterns are memorised in 
some sense (Hopfield, 1982). The patterns themselves are random, making the Gibbs measure 
a random measure; usually they are taken to be vectors of independent Bernoulli variables, 
but we will also consider more general patterns. 

Our Hamiltonian is normalised by $1/\sqrt{NM}$ rather than the usual $1/N$. This is the correct
normalisation as $\alpha = M/N \to \infty$; observe that if $1/\sqrt M$ were replaced by $1/\sqrt N$ in (2),
then the expression in parentheses would diverge as $\alpha \to \infty$, resulting in a degenerate Gibbs
measure. As a result of this normalisation, our inverse temperature $\beta$ differs from the usual
inverse temperature by a factor of $\sqrt\alpha$.

An important quantity associated with the Gibbs measure is the free energy 

$$F(\beta,B) = \frac1N\log Z(\beta,B) = \frac1N\log\sum_x e^{-\beta H(x)+B\sum_i x_i}. \tag{3}$$



The significance of the free energy is that expectations with respect to the Gibbs measure 
contain a factor of 1/Z, so many useful quantities appear as derivatives of F. 
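For very small systems the free energy can be evaluated by enumerating all $2^N$ configurations. The following is a minimal sketch, not the author's code: it assumes the Hamiltonian $H(x;\xi) = -(NM)^{-1/2}\sum_k (x\cdot\xi^k)^2$, and the function name `hopfield_free_energy` and the `(M, N)` array layout are illustrative choices.

```python
import itertools
import numpy as np

def hopfield_free_energy(xi, beta, B=0.0):
    """F = (1/N) log sum_x exp(-beta*H(x) + B*sum_i x_i), by exhaustive enumeration.

    xi has shape (M, N); row k is the pattern xi^k (assumed layout).
    Assumes H(x) = -(N*M)^(-1/2) * sum_k (x . xi^k)^2.
    """
    M, N = xi.shape
    # All 2^N configurations x in {-1,+1}^N, one per row.
    X = np.array(list(itertools.product([-1.0, 1.0], repeat=N)))
    overlaps = X @ xi.T                          # entry (x, k) is x . xi^k
    H = -(overlaps ** 2).sum(axis=1) / np.sqrt(N * M)
    log_weights = -beta * H + B * X.sum(axis=1)
    # log-sum-exp, shifted by the maximum for numerical stability
    m = log_weights.max()
    return (m + np.log(np.exp(log_weights - m).sum())) / N
```

At $\beta = 0$ the sum factorises and the function returns $\log(2\cosh B)$, which gives a quick sanity check. The cost grows as $2^N$, so this is only practical for $N$ up to roughly 20.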



A related model is the Sherrington-Kirkpatrick (SK) model, with Hamiltonian 

$$H^{\mathrm{SK}}(x;J) = -\frac{1}{\sqrt N}\sum_{i=1}^{N}\sum_{j=1}^{N}J_{ij}x_ix_j. \tag{4}$$

The $J_{ij}$ represent ferromagnetic interactions between sites $i$ and $j$, and are usually taken to be
independent standard Gaussian random variables. In this case, Guerra and Toninelli (2002)
proved that the SK free energy converges to a limit (with exponentially high probability). The 
value of the limit was conjectured by Parisi (1980) and proved by Guerra (2003) and Talagrand 
(2006), although it is difficult to compute. Carmona and Hu (2003) showed that the limit holds 
for any interactions which are independent (not necessarily identically distributed) with mean 
0, variance 1 and bounded third moment. 

Our main result is that the Hopfield free energy, when appropriately normalised, converges to 
this same limit as the ratio a = M/N of patterns to sites grows large. We require the patterns 
to be symmetric, independent and identically distributed, with unit variance and bounded 
eleventh moment. Some generalisations which are not difficult to prove include replacing
the symmetric requirement with normal concentration, removing the identically distributed 
requirement or including more general external fields, but we will not discuss them further. 

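Theorem 1. If $\alpha > 5\beta^2$, then for some universal constant $C$,

$$\Big|\,\mathbb{E}\big[F^{\mathrm{Hop}}_{N,M}(\beta,B;\xi)\big] - \mathbb{E}\big[F^{\mathrm{SK}}_{N}(\sqrt2\beta,B;J)\big] - \beta\sqrt\alpha\,\Big| \le \frac{C\beta^3}{\sqrt\alpha} + \frac{C\beta^3}{\sqrt M}. \tag{5}$$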

It is known (Talagrand, 1996) that the Hopfield free energy concentrates for Bernoulli patterns 
as $N \to \infty$; we give a proof for more general patterns in Theorem 8. This gives the following
behaviour of the limiting free energy with respect to a, which is confirmed in Figure 1. 

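Corollary 2. Let $P(\beta,B) = \lim_{N\to\infty}F^{\mathrm{SK}}_{N}(\beta,B;J)$ be the limiting SK free energy. Then,

$$\lim_{N\to\infty}F^{\mathrm{Hop}}_{N,\alpha N}(\beta,B;\xi) = \beta\sqrt\alpha + P(\sqrt2\beta,B) + O(\beta^3/\sqrt\alpha). \tag{6}$$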

The Hopfield model with large $\alpha$ is not well-studied. The region $\alpha \to 0$ is the best understood
(Shcherbina and Tirozzi, 1993; Bovier and Gayrard, 1997), and the problem becomes considerably
more difficult as soon as $\limsup \alpha > 0$ (Bovier, Gayrard and Picco, 1995; Talagrand,
1998, 2000). The author is not aware of any known results for unbounded $\alpha$.

Heuristically, one can guess that the Hopfield model behaves like the SK model for large $\alpha$,
since the interaction between sites $i$ and $j$ is $\sum_k \xi_i^k\xi_j^k$, which converges to Gaussian when
the limits $\alpha \to \infty$ and $N \to \infty$ are swapped. However, the Hopfield interactions are certainly
not independent, nor is it clear that we can exchange these limits.

Theorem 1 shows that the order of the limits does not matter, while Lemma 5 shows that the
dependence between interactions exactly corresponds to the $\beta\sqrt\alpha$ term in Theorem 1. In fact,
this term comes entirely from self-interactions when $i = j$; if we restrict our Hamiltonians
to $i < j$, then $F^{\mathrm{Hop}}(\beta,B) \approx F^{\mathrm{SK}}(\beta,B)$ with no $\beta\sqrt\alpha$ term. Note that the $\sqrt2$ factor in the
inverse temperature also disappears since it arises from the fact that the interactions $J_{ij}$ and
$J_{ji}$ are independent in the SK model but equal in the Hopfield model.
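The origin of the $\sqrt2$ factor can be checked numerically. The sketch below is illustrative only and assumes Bernoulli patterns with the normalised Hopfield coupling $g_{ij} = M^{-1/2}\sum_k \xi_i^k\xi_j^k$; the function name `pair_coupling_std` and the sample sizes are arbitrary. It compares the total coupling carried by the pair $\{1,2\}$ in each model: $2g_{12}$ for Hopfield (since $g_{12} = g_{21}$) against $J_{12} + J_{21}$ for SK.

```python
import numpy as np

def pair_coupling_std(M=50, samples=20000, seed=0):
    """Empirical standard deviations of the coupling on the pair {1, 2}."""
    rng = np.random.default_rng(seed)
    # Hopfield: g_12 = g_21 = M^(-1/2) * sum_k xi_1^k xi_2^k, so the pair carries 2*g_12.
    xi1 = rng.choice([-1.0, 1.0], size=(samples, M))
    xi2 = rng.choice([-1.0, 1.0], size=(samples, M))
    hop = 2 * (xi1 * xi2).sum(axis=1) / np.sqrt(M)
    # SK: J_12 and J_21 are independent standard Gaussians, so the pair carries J_12 + J_21.
    sk = rng.standard_normal((samples, 2)).sum(axis=1)
    return hop.std(), sk.std()
```

The two standard deviations should come out close to $2$ and $\sqrt2$ respectively, so matching the models requires running SK at inverse temperature $\sqrt2\beta$.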



Our paper is similar in structure to the universality result of Carmona and Hu (2003) for the
SK model, but their bounds do not carry over directly to the Hopfield model. The problem
is that the Hopfield Hamiltonian is quadratic in the patterns while the SK Hamiltonian is
linear in the interactions. We solve this problem by bounding various quantities of interest
by moments of the overlap $S = (x\cdot\xi^1)/\sqrt N$, which are evaluated in Theorem 4 using the
symmetry trick of Lemma 3. This is similar to Theorem 2.2 in Talagrand (2000), but applies
to a different parameter region and has a much more elementary proof.




Figure 1. Plot of free energy (vertical axis) against the parameter $\alpha$
(horizontal axis), for $\beta = 1$, $B = 0$ (bottom plot) and $\beta = 2$, $B = h$
(top plot). The points are realisations of the Hopfield free energy with
$N = 50$ and $\alpha = 1, 2, \ldots, 50$. The curves are $\beta\sqrt\alpha + P(\sqrt2\beta, B)$,
estimated by averaging 100 realisations of $F^{\mathrm{SK}}_N(\sqrt2\beta, B)$.
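A comparison of this kind can be sketched for a much smaller system by exact enumeration. This is not the procedure used to produce the figure, which is not described here; the system size, sample counts, Bernoulli patterns, and the names `free_energy` and `compare` are all assumptions made for illustration, and the Hamiltonians are taken as $H^{\mathrm{Hop}} = -(NM)^{-1/2}\sum_k(x\cdot\xi^k)^2$ and $H^{\mathrm{SK}} = -N^{-1/2}\sum_{i,j}J_{ij}x_ix_j$.

```python
import itertools
import numpy as np

def free_energy(log_weights, N):
    """(1/N) * log(sum(exp(log_weights))), computed stably."""
    m = log_weights.max()
    return (m + np.log(np.exp(log_weights - m).sum())) / N

def compare(N=8, alpha=20, beta=1.0, B=0.0, sk_samples=100, seed=0):
    """Return (one Hopfield realisation, beta*sqrt(alpha) + averaged SK free energy)."""
    rng = np.random.default_rng(seed)
    M = alpha * N
    X = np.array(list(itertools.product([-1.0, 1.0], repeat=N)))  # all 2^N configurations
    field = B * X.sum(axis=1)

    # One realisation of the Hopfield free energy with Bernoulli patterns.
    xi = rng.choice([-1.0, 1.0], size=(M, N))
    H_hop = -((X @ xi.T) ** 2).sum(axis=1) / np.sqrt(N * M)
    F_hop = free_energy(-beta * H_hop + field, N)

    # SK free energy at inverse temperature sqrt(2)*beta, averaged over realisations.
    F_sk = []
    for _ in range(sk_samples):
        J = rng.standard_normal((N, N))
        H_sk = -np.einsum("ci,ij,cj->c", X, J, X) / np.sqrt(N)
        F_sk.append(free_energy(-np.sqrt(2) * beta * H_sk + field, N))
    return F_hop, beta * np.sqrt(alpha) + np.mean(F_sk)
```

Calling `compare()` for a range of `alpha` values gives pairs that can be plotted against each other; at such small $N$ the finite-size fluctuations are substantial, so only rough agreement should be expected.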
2 Main Result



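For $0 \le t \le 1$, define the interpolated Hamiltonian

$$H_{N,M,t}(x;\xi,J) = \sqrt{1-t}\,\sqrt2\,H^{\mathrm{SK}}_{N}(x;J) + \sqrt t\,H^{\mathrm{Hop}}_{N,M}(x;\xi) + N\sqrt{\alpha t} \tag{7}$$
$$= -\sqrt{\frac{2(1-t)}{N}}\sum_{i,j}J_{ij}x_ix_j - \sqrt{\frac{t}{NM}}\sum_{k}\sum_{i,j}\xi_i^k\xi_j^kx_ix_j + N\sqrt{\alpha t}. \tag{8}$$

Also define the corresponding partition function $Z = \sum_x e^{-\beta H(x)+B\sum_i x_i}$, Gibbs measure
$G(x) = e^{-\beta H(x)+B\sum_i x_i}/Z$ and free energy $F = \frac1N\log Z$. Let $\langle f\rangle = \sum_x f(x)G(x)$ denote
expectation with respect to the Gibbs measure. For convenience, we will often drop implied
dependences; we refer to the interpolated quantities unless explicitly stated otherwise.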

Assume the SK interactions $J$ are independent standard Gaussian, and the Hopfield patterns $\xi$
are symmetric, independent and identically distributed with unit variance; we will state more
assumptions when they are required. We begin by giving a bound on moments of the overlap
$S = (x\cdot\xi^1)/\sqrt N = \frac{1}{\sqrt N}\sum_i x_i\xi_i^1$, which will be crucial to the remainder of the paper.

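Lemma 3. Suppose $\sqrt\alpha > (2+3\epsilon)\beta$ for some $\epsilon > 0$, and let $A = \{\|\xi^1\|_2^2 > (1+\epsilon)N\}$. Then,
for $c > 0$ and $C < \infty$ depending only on $\epsilon$, and all $r > 0$,

$$\mathbb{E}\big[G(S^2 > r)\,1_{A^c}\big] \le Ce^{-cr}. \tag{9}$$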



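Proof. Let $E_u = \{u \le S^2 < u+1\}$, considered as either a set of configurations or a set of
patterns depending on context. Let $H^* = H + \sqrt t\,S^2/\sqrt\alpha$ be the Hamiltonian with the $\xi^1$ term
removed, and let $G^*$ be the corresponding Gibbs measure with expectation $\langle\cdot\rangle^*$. Then,

$$G(S^2 > r) \le \sum_{u=r}^{\infty}G(E_u) = \sum_{u=r}^{\infty}\frac{\sum_x 1_{E_u}\,e^{\beta\sqrt t S^2/\sqrt\alpha}\,e^{-\beta H^*(x)+B\sum_i x_i}}{\sum_x e^{\beta\sqrt t S^2/\sqrt\alpha}\,e^{-\beta H^*(x)+B\sum_i x_i}} \tag{10}$$
$$\le \sum_{u=r}^{\infty}e^{\beta\sqrt t(u+1)/\sqrt\alpha}\,\frac{\sum_x 1_{E_u}\,e^{-\beta H^*(x)+B\sum_i x_i}}{\sum_x e^{-\beta H^*(x)+B\sum_i x_i}} = \sum_{u=r}^{\infty}e^{\beta\sqrt t(u+1)/\sqrt\alpha}\,G^*(E_u). \tag{11}$$

Since $G^*$ does not depend on $\xi^1$, by Tonelli's theorem, we can exchange $\mathbb{E}_{\xi^1}$ and $\langle\cdot\rangle^*$ to obtain (using $t \le 1$)

$$\mathbb{E}\big[G(S^2 > r)\,1_{A^c}\big] \le \sum_{u=r}^{\infty}e^{\beta(u+1)/\sqrt\alpha}\,\mathbb{E}\big[\langle\mathbb{P}_{\xi^1}(E_u\cap A^c)\rangle^*\big]. \tag{12}$$

Since replacing $\xi_i^1$ with $x_i\xi_i^1$ changes neither the distribution of $\xi^1$ nor the value of $\|\xi^1\|_2$,
$\mathbb{P}_{\xi^1}(E_u\cap A^c)$ does not depend on $x$. Then, by Hoeffding's (1963) inequality, applied conditionally on the absolute values $|\xi_i^1|$,

$$\mathbb{P}_{\xi^1}(E_u\cap A^c) \le \mathbb{E}\Big[2\exp\Big(-\frac{uN}{2\|\xi^1\|_2^2}\Big)1_{A^c}\Big]. \tag{13}$$

Since $A^c = \{\|\xi^1\|_2^2 \le (1+\epsilon)N\}$, this is bounded by $2e^{-u/(2+2\epsilon)}$, hence

$$\mathbb{E}\big[G(S^2 > r)\,1_{A^c}\big] \le 2\sum_{u=r}^{\infty}\exp\Big(\frac{\beta(u+1)}{\sqrt\alpha} - \frac{u}{2+2\epsilon}\Big). \tag{14}$$

Our assumption $\sqrt\alpha > (2+3\epsilon)\beta$ implies the required bound.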

□ 
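To make the last step concrete, the geometric series can be summed explicitly. The sketch below is an illustration rather than part of the proof: it assumes the tail bound has the form $2\sum_{u\ge r}\exp(b(u+1) - u/(2+2\epsilon))$ with $b = \beta/\sqrt\alpha < 1/(2+3\epsilon)$ and integer $r \ge 0$, and the function names are hypothetical. Under that assumption one admissible choice is $c = 1/(2+2\epsilon) - 1/(2+3\epsilon)$ and $C = 2e^{1/(2+3\epsilon)}/(1-e^{-c})$.

```python
import math

def lemma3_constants(eps):
    """One admissible pair (c, C) with
    2 * sum_{u >= r} exp(b*(u+1) - u/(2+2*eps)) <= C * exp(-c*r)
    for every 0 <= b < 1/(2+3*eps); b plays the role of beta/sqrt(alpha)."""
    b_max = 1.0 / (2 + 3 * eps)
    c = 1.0 / (2 + 2 * eps) - b_max        # worst-case decay rate, strictly positive
    C = 2 * math.exp(b_max) / (1 - math.exp(-c))
    return c, C

def tail_sum(b, eps, r, terms=5000):
    """Numerical value of the series, truncated after `terms` summands."""
    return 2 * sum(math.exp(b * (u + 1) - u / (2 + 2 * eps))
                   for u in range(r, r + terms))
```

For instance, with `eps = 0.5` and `b` equal to 90% of its maximum, `tail_sum(b, 0.5, r)` stays below `C * exp(-c * r)` for every `r` tried, as the closed form predicts.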
Theorem 4. Suppose $\sqrt\alpha > (2+3\epsilon)\beta$, and the patterns $\xi$ have bounded $m$th moment for some
$m > 2$. For any $d$ such that $0 < d < (m-2)(1-d/m)$, there is a constant $C_d$ depending only
on $\epsilon$ and $\mathbb{E}[|\xi_1^1|^m]$ such that $\mathbb{E}[\langle|S|^d\rangle] \le C_d$.

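Proof. Let $A = \{\|\xi^1\|_2^2 > (1+\epsilon)N\}$. Then,

$$\mathbb{E}[\langle|S|^d\rangle] = \mathbb{E}[\langle|S|^d\rangle 1_A] + \mathbb{E}[\langle|S|^d\rangle 1_{A^c}] \tag{15}$$
$$\le \mathbb{E}[\langle|S|^m\rangle]^{d/m}\,\mathbb{P}[A]^{1-d/m} + \int_0^\infty\mathbb{E}\big[G(|S|^d > r)\,1_{A^c}\big]\,dr. \tag{16}$$

By Lemma 3, this integral is bounded. Using the trivial bound $\mathbb{E}[\langle|S|^m\rangle] \le N^{m/2}\,\mathbb{E}[|\xi_1^1|^m]$,
since $(m-2)(1-d/m) > d$, it suffices to prove $\mathbb{P}[A] = O(N^{-(m-2)/2}(\log N)^m)$.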

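Let $f(r) = r^m\,\mathbb{P}[|\xi_1^1| > r]$. We claim that there exists $c$ such that for any $r > 0$, the interval
$[r, cr]$ contains a point $u$ with $f(u) < 1$. We prove this by contradiction; if the claim is false,
then for any $c$, there is $r_c$ such that $f \ge 1$ on $[r_c, cr_c]$, and we arrive at the contradiction

$$\mathbb{E}[|\xi_1^1|^m] = \int_0^\infty mr^{m-1}\,\mathbb{P}[|\xi_1^1| > r]\,dr = m\int_0^\infty\frac{f(r)}{r}\,dr \ge \int_{r_c}^{cr_c}\frac1r\,dr = \log c. \tag{17}$$

Thus, we can pick $r_N$ such that $\sqrt N/\log N \le r_N \le c\sqrt N/\log N$ and $\mathbb{P}[|\xi_1^1| > r_N] < r_N^{-m}$. Let
$B = \{\forall i\ |\xi_i^1| \le r_N\}$. Then,

$$\mathbb{P}[B^c] \le N\,\mathbb{P}[|\xi_1^1| > r_N] < Nr_N^{-m} \le N^{1-m/2}(\log N)^m.$$

Finally, by Hoeffding's (1963) inequality,

$$\mathbb{P}[B\cap A] \le 2e^{-\epsilon^2N/2r_N^2} \le 2e^{-\epsilon^2(\log N)^2/2c^2}. \tag{18}$$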



□ 



The next two lemmas are formulae for evaluating quantities that will be important later, while 
Lemma 7 is a universality result establishing that the distribution of patterns does not matter, 
thus allowing us to apply Stein's lemma as if they were Gaussian. 



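Lemma 5. The interpolated mean free energy has derivative

$$\frac{d}{dt}\mathbb{E}[F] = -\frac{\beta(N-1)}{\sqrt{2(1-t)N}}\,\mathbb{E}[J_{12}\langle x_1x_2\rangle] + \frac{\beta(N-1)M}{2\sqrt{tNM}}\,\mathbb{E}[\xi_1^1\xi_2^1\langle x_1x_2\rangle]. \tag{19}$$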

Proof. We have

$$\frac{d}{dt}\mathbb{E}[F] = \mathbb{E}\Big[\frac{1}{NZ}\frac{dZ}{dt}\Big] = -\frac{\beta}{N}\,\mathbb{E}\Big[\Big\langle\frac{dH}{dt}\Big\rangle\Big]. \tag{20}$$

Next, we differentiate (8) and split the sum over $i$ and $j$ into two sums where $i \ne j$ and
$i = j$ respectively. The $i \ne j$ terms give (19) upon taking $i = 1$, $j = 2$ and $k = 1$ without
loss of generality. For the $i = j$ terms, the SK term is a multiple of $\mathbb{E}[J_{ii}\langle x_i^2\rangle] = 0$, while the
Hopfield term exactly cancels the derivative of $N\sqrt{\alpha t}$. □



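Lemma 6. For any function $g = g(x;\xi,J)$, and $S = (x\cdot\xi^1)/\sqrt N$,

$$\frac{\partial\langle g\rangle}{\partial\xi_i^1} = \Big\langle\frac{\partial g}{\partial\xi_i^1}\Big\rangle + \frac{2\beta\sqrt t}{\sqrt M}\big(\langle g\,x_iS\rangle - \langle g\rangle\langle x_iS\rangle\big). \tag{21}$$

Proof. Explicit derivative calculation. □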
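Lemma 7. Suppose $\alpha > (4+\epsilon)\beta^2$ for some $\epsilon > 0$, and the patterns $\xi$ have bounded $m$th
moment for $m = 11$. Then, for some constant $C$ depending only on $\epsilon$ and $\mathbb{E}[|\xi_1^1|^{11}]$,

$$\Big|\,\mathbb{E}[\xi_1^1\xi_2^1\langle x_1x_2\rangle] - \mathbb{E}\Big[\frac{\partial^2\langle x_1x_2\rangle}{\partial\xi_1^1\,\partial\xi_2^1}\Big]\Big| \le \frac{C\beta^2}{M\sqrt N}. \tag{22}$$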



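Proof. Define the probabilistic function $f : \mathbb{R}^2 \to \mathbb{R}$ by $f = \langle x_1x_2\rangle$ considered as a function
of $y = \xi_1^1$ and $z = \xi_2^1$, with the other $\xi_i^k$ regarded as random parameters. Since $f$ is infinitely
differentiable, we can define its second degree Taylor expansion $T$ around 0, that is,

$$T(y,z) = f(0,0) + yf_y(0,0) + zf_z(0,0) + \tfrac12y^2f_{yy}(0,0) + \tfrac12z^2f_{zz}(0,0) + yzf_{yz}(0,0). \tag{23}$$

Since the evaluated derivatives depend only on the $\xi_i^k$ other than $y$ or $z$, which are independent
with mean 0 and variance 1, an easy calculation yields $\mathbb{E}[yzT(y,z)] = \mathbb{E}[f_{yz}(0,0)]$, hence

$$\big|\mathbb{E}[yzf(y,z) - f_{yz}(y,z)]\big| = \big|\mathbb{E}\big[yz(f-T)(y,z) - \big(f_{yz}(y,z) - f_{yz}(0,0)\big)\big]\big|. \tag{24}$$

By Taylor's theorem,

$$(f-T)(y,z) = \int_0^1\Big[\tfrac12y^3f_{yyy}(sy,sz) + \tfrac32y^2zf_{yyz}(sy,sz) + \tfrac32yz^2f_{yzz}(sy,sz) + \tfrac12z^3f_{zzz}(sy,sz)\Big](1-s)^2\,ds, \tag{25}$$
$$f_{yz}(y,z) - f_{yz}(0,0) = \int_0^1\big(yf_{yyz}(sy,sz) + zf_{yzz}(sy,sz)\big)\,ds. \tag{26}$$

By symmetry of $f(y,z) = f(z,y)$,

$$\mathbb{E}[yzf(y,z) - f_{yz}(y,z)] = \int_0^1(1-s)^2\,\mathbb{E}[y^4zf_{yyy}(sy,sz)] + \mathbb{E}\big[\big(3(1-s)^2y^3z^2 - 2y\big)f_{yyz}(sy,sz)\big]\,ds. \tag{27}$$

Let $X = x_1x_2$ and $U = x_1S$. By Lemma 6,

$$f_y = \frac{2\beta\sqrt t}{\sqrt M}\big(\langle XU\rangle - \langle X\rangle\langle U\rangle\big), \tag{28}$$
$$f_{yy} = \Big(\frac{2\beta\sqrt t}{\sqrt M}\Big)^2\big(\langle XU^2\rangle - 2\langle XU\rangle\langle U\rangle - \langle X\rangle\langle U^2\rangle + 2\langle X\rangle\langle U\rangle^2\big), \tag{29}$$
$$f_{yyy} = \Big(\frac{2\beta\sqrt t}{\sqrt M}\Big)^3\big(\langle XU^3\rangle - 3\langle XU^2\rangle\langle U\rangle - 3\langle XU\rangle\langle U^2\rangle + 6\langle XU\rangle\langle U\rangle^2 - \langle X\rangle\langle U^3\rangle + 6\langle X\rangle\langle U^2\rangle\langle U\rangle - 6\langle X\rangle\langle U\rangle^3\big). \tag{30}$$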



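By Hölder's inequality, $|f_{yyy}| \le 8\beta^3M^{-3/2}\cdot 26\,\langle|S|^3\rangle$. Let $\langle\cdot\rangle'$ be expectation with respect to
the Gibbs measure with $\xi_1^1$ and $\xi_2^1$ replaced by $s\xi_1^1$ and $s\xi_2^1$ respectively. Then,

$$\big|\mathbb{E}[y^4zf_{yyy}(sy,sz)]\big| \le \frac{8\beta^3}{M^{3/2}}\cdot 26\,\mathbb{E}\Big[|y|^4|z|\,\Big\langle\Big|S - \frac{(1-s)(x_1y + x_2z)}{\sqrt N}\Big|^3\Big\rangle'\Big]. \tag{31}$$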
Thus, we need to bound $\mathbb{E}[\langle|y|^{4+a}|z|^{1+b}|S|^c\rangle']$, where $a + b + c = 3$. By Hölder's inequality, this
is bounded by $\mathbb{E}\big[|y|^{(4+a)d/(d-c)}|z|^{(1+b)d/(d-c)}\big]^{1-c/d}\,\mathbb{E}[\langle|S|^d\rangle']^{c/d}$, where $(4+a)d/(d-c) \le m$
and $c \le d < (m-2)(1-d/m)$. By Theorem 4, which applies to $\langle\cdot\rangle'$ since scaling finitely
many patterns does not affect the proof, this yields $\mathbb{E}[y^4zf_{yyy}(sy,sz)] = O(\beta^3/M^{3/2})$; some
algebra shows we need $m > 6 + \sqrt{22} \approx 10.7$.
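The algebra behind the moment threshold can be checked directly. The sketch below assumes the binding case is $a = b = 0$, $c = 3$, so that the two constraints read $4d/(d-3) \le m$ and $d < (m-2)(1-d/m)$; that reduction, and the function names, are assumptions of this illustration. Solving each constraint for $d$ gives $d \ge 3m/(m-4)$ and $d < m(m-2)/(2m-2)$, and a valid $d$ exists exactly when $m^2 - 12m + 14 > 0$.

```python
import math

def admissible_d_interval(m):
    """Return (lower, upper): a valid d must satisfy lower <= d < upper.

    Assumes the binding case a = b = 0, c = 3 of the Holder exponents.
    """
    lower = 3 * m / (m - 4)             # from 4d/(d-3) <= m, valid for m > 4
    upper = m * (m - 2) / (2 * m - 2)   # from d < (m-2)(1 - d/m)
    return lower, upper

def moment_threshold():
    """Larger root of m^2 - 12m + 14 = 0."""
    return 6 + math.sqrt(22)
```

For `m = 11` the interval is non-empty (roughly `4.71 <= d < 4.95`), while for `m = 10` it is empty, which is consistent with the eleventh-moment assumption.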

Bounding the second expectation in (27) is similar. Using Lemma 6 to differentiate (29) with
respect to $z$ and applying Hölder's inequality gives

$$|f_{yyz}| \le \frac{4\beta^2}{M\sqrt N}\cdot 8\,\langle|S|\rangle + \frac{8\beta^3}{M^{3/2}}\cdot 26\,\langle|S|^3\rangle. \tag{32}$$

Scaling $\xi_1^1$ and $\xi_2^1$ by $s$, multiplying by $3(1-s)^2y^3z^2 - 2y$ and taking expectations, the same
argument gives the bound $O(\beta^2/M\sqrt N)$, noting that $\beta/\sqrt\alpha < 1$. □

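Theorem 1. Suppose the patterns $\xi$ have bounded eleventh moment. If $\alpha > (4+\epsilon)\beta^2$ for
some $\epsilon > 0$, then for some constant $C$ depending only on $\epsilon$,

$$\Big|\,\mathbb{E}\big[F^{\mathrm{Hop}}_{N,M}(\beta,B;\xi)\big] - \mathbb{E}\big[F^{\mathrm{SK}}_{N}(\sqrt2\beta,B;J)\big] - \beta\sqrt\alpha\,\Big| \le \frac{C\beta^3}{\sqrt\alpha} + \frac{C\beta^3}{\sqrt M}. \tag{33}$$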
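Proof. Observe that

$$\mathbb{E}\big[F^{\mathrm{Hop}}_{N,M}(\beta,B;\xi)\big] - \mathbb{E}\big[F^{\mathrm{SK}}_{N}(\sqrt2\beta,B;J)\big] - \beta\sqrt\alpha = \int_0^1\frac{d}{dt}\mathbb{E}\big[F_{N,M,t}(\beta,B;\xi,J)\big]\,dt. \tag{34}$$

By Lemma 5, it suffices to prove

$$\Big|\frac{\beta(N-1)}{\sqrt{2(1-t)N}}\,\mathbb{E}[J_{12}\langle x_1x_2\rangle] - \frac{\beta(N-1)M}{2\sqrt{tNM}}\,\mathbb{E}[\xi_1^1\xi_2^1\langle x_1x_2\rangle]\Big| \le \frac{C}{\sqrt t}\Big(\frac{\beta^3}{\sqrt\alpha} + \frac{\beta^3}{\sqrt M}\Big). \tag{35}$$

Let $X = x_1x_2$ and $\mathrm{Var}(X) = \langle X^2\rangle - \langle X\rangle^2$. By Stein's lemma,

$$\mathbb{E}[J_{12}\langle x_1x_2\rangle] = \mathbb{E}\Big[\frac{\partial\langle x_1x_2\rangle}{\partial J_{12}}\Big] = \frac{\beta\sqrt{2(1-t)}}{\sqrt N}\,\mathbb{E}[\mathrm{Var}(X)]. \tag{36}$$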



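Let $U = x_1S$ and $V = x_2S$. By Lemmas 6 and 7 applied to (28),

$$\mathbb{E}[\xi_1^1\xi_2^1\langle x_1x_2\rangle] = \frac{4\beta^2t}{M}\,\mathbb{E}\big[\langle XUV\rangle - \langle XU\rangle\langle V\rangle - \langle XV\rangle\langle U\rangle - \langle X\rangle\langle UV\rangle + 2\langle X\rangle\langle U\rangle\langle V\rangle\big]$$
$$\qquad + \frac{2\beta\sqrt t}{\sqrt{NM}}\,\mathbb{E}[\mathrm{Var}(X)] + O\Big(\frac{\beta^2}{M\sqrt N}\Big). \tag{37}$$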



By Hölder's inequality and Theorem 4, the first line is $O(\beta^2/M)$, which gives the $C\beta^3/\sqrt\alpha$
term. In the second line, the first term exactly cancels (36), while the second term gives the
$O(\beta^3/\sqrt M)$ error. It is clear that $C$ depends only on $\epsilon$ and $\mathbb{E}[|\xi_1^1|^{11}]$; the moment dependence
only arises in the $N^{-(m-2)/2}(\log N)^m$ term in Theorem 4 and does not affect the dominant term. □
Finally, we prove the concentration that allows us to conclude Corollary 2 from Theorem 1.

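Theorem 8. Suppose $\alpha > (4+\epsilon)\beta^2$ for some $\epsilon > 0$, and the patterns $\xi$ have bounded $m$th
moment. If $4 \le 2d < (m-2)(1-2d/m)$, then for some $C$ depending only on $\epsilon$ and $\mathbb{E}[|\xi_1^1|^m]$,

$$\mathbb{E}\big[\,|F^{\mathrm{Hop}} - \mathbb{E}[F^{\mathrm{Hop}}]|^d\,\big] \le \frac{C\beta^d}{N^{d/2}}. \tag{38}$$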



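Proof. We follow the proof of Carmona and Hu (2003) for the SK model, using our bound
from Theorem 4. Since we are only dealing with the Hopfield free energy here, symbols in this
proof will refer to the Hopfield quantities rather than the interpolated ones.

Let $\mathcal{F}_\ell$ be the $\sigma$-algebra generated by $\xi^1, \ldots, \xi^\ell$, and let $D_\ell = \mathbb{E}[\log Z \mid \mathcal{F}_\ell] - \mathbb{E}[\log Z \mid \mathcal{F}_{\ell-1}]$.
The $D_\ell$ are martingale differences with $\sum_\ell D_\ell = N(F - \mathbb{E}[F])$, so by Burkholder's (1973)
martingale inequality and convexity of $x \mapsto x^{d/2}$, for some universal constant $C_d$,

$$\mathbb{E}\big[|F - \mathbb{E}[F]|^d\big] = \frac{1}{N^d}\,\mathbb{E}\Big[\Big|\sum_\ell D_\ell\Big|^d\Big] \le \frac{C_d}{N^d}\,\mathbb{E}\Big[\Big(\sum_\ell D_\ell^2\Big)^{d/2}\Big] \le \frac{C_dM^{d/2-1}}{N^d}\sum_\ell\mathbb{E}[|D_\ell|^d]. \tag{39}$$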



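Let $\beta' = \beta/\sqrt{NM}$. Observe that for any fixed $1 \le \ell \le M$, we can write

$$\log Z = \log\sum_x e^{\beta'\sum_k(x\cdot\xi^k)^2 + B\sum_i x_i} \tag{40}$$
$$= \log\sum_x e^{\beta'\sum_{k\ne\ell}(x\cdot\xi^k)^2 + B\sum_i x_i} - \log\big\langle e^{-\beta'(x\cdot\xi^\ell)^2}\big\rangle. \tag{41}$$

Since the first term above is independent of $\xi^\ell$, it follows that

$$D_\ell = -\mathbb{E}\big[\log\big\langle e^{-\beta'(x\cdot\xi^\ell)^2}\big\rangle \,\big|\, \mathcal{F}_\ell\big] + \mathbb{E}\big[\log\big\langle e^{-\beta'(x\cdot\xi^\ell)^2}\big\rangle \,\big|\, \mathcal{F}_{\ell-1}\big]. \tag{42}$$

By Hölder's inequality,

$$\mathbb{E}[|D_\ell|^d] \le 2^d\,\mathbb{E}\Big[\big|\log\big\langle e^{-\beta'(x\cdot\xi^\ell)^2}\big\rangle\big|^d\Big]. \tag{43}$$

The function $\phi(z) = |\log z|^d$ has second derivative $\phi''(z) = \big(d(d-1)|\log z|^{d-2} - d\,|\log z|^{d-2}\log z\big)/z^2$,
so $\phi$ is convex for $0 < z < e^{d-1}$. Since $0 < \langle e^{-\beta'(x\cdot\xi^\ell)^2}\rangle \le 1 < e^{d-1}$, by Jensen's inequality,

$$\mathbb{E}[|D_\ell|^d] \le 2^d\,\mathbb{E}\Big[\Big\langle\big|\log e^{-\beta'(x\cdot\xi^\ell)^2}\big|^d\Big\rangle\Big] = 2^d\Big(\frac{\beta}{\sqrt\alpha}\Big)^d\,\mathbb{E}\Big[\Big\langle\Big|\frac{x\cdot\xi^\ell}{\sqrt N}\Big|^{2d}\Big\rangle\Big]. \tag{44}$$

This is $O(\beta^d/\alpha^{d/2})$ by Theorem 4.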

□ 